Online Courses from Harvard and MIT Data Exploration

By: Hassan Hosny

Investigation Overview

This investigation explores the Harvard and MIT online courses data to understand which of the two institutions (MIT or Harvard) performs better on edX.

Dataset Overview

This dataset provides data on 290 Harvard and MIT online courses, 250 thousand certifications, 4.5 million participants, and 28 million participant hours on the edX platform since 2012.


Which institution has more courses on edX?

MIT has 161 courses, which is more than Harvard.

How many courses does each instructor provide?

The top two instructors are Peter Bol and Bill Kirby, who provide 20 courses, followed by the rest of the instructors in the data.
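A per-instructor count like this can be sketched with pandas. The snippet below is a minimal illustration, not the notebook's original code: the `Instructors` column name, the comma-separated name format, and the toy rows are all assumptions.

```python
import pandas as pd

# Toy rows with made-up values; the "Instructors" column name and the
# comma-separated format are assumptions about the dataset.
courses = pd.DataFrame({
    "Course Title": ["Course A", "Course B", "Course C"],
    "Instructors": ["Instructor X, Instructor Y", "Instructor X", "Instructor Z"],
})

# Split each cell into a list of names, give every name its own row,
# and trim the whitespace left over from the split.
names = courses["Instructors"].str.split(",").explode().str.strip()

# Count how many courses each instructor appears in, most frequent first.
courses_per_instructor = names.value_counts()
print(courses_per_instructor)
```

Splitting before counting matters when a course lists several instructors: counting the raw column would treat each distinct combination of names as a separate "instructor".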

How many participants does each institution have?

MITx has 2.101121 M participants, which is fewer than HarvardX's 2.348736 M.
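Per-institution figures such as course counts and participant totals can be computed with a grouped aggregation. This is a hedged sketch on made-up rows; the column names `Institution`, `Course Title`, and `Participants` are assumptions, not confirmed names from the dataset.

```python
import pandas as pd

# Made-up example rows; the column names are assumptions.
courses = pd.DataFrame({
    "Institution": ["HarvardX", "HarvardX", "MITx", "MITx"],
    "Course Title": ["Course A", "Course B", "Course C", "Course D"],
    "Participants": [1200, 800, 1500, 300],
})

# One row per institution: number of courses and total participants.
per_institution = courses.groupby("Institution").agg(
    n_courses=("Course Title", "count"),
    participants=("Participants", "sum"),
)

# Express the totals in millions, the unit used on the slide.
per_institution["participants_millions"] = per_institution["participants"] / 1_000_000
print(per_institution.sort_values("participants", ascending=False))
```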

What is the distribution of the certified participants for each institution?

The number of certified participants appears to follow an exponential distribution.

Is there a relationship between features?

Certified has a moderate positive relationship with the participants who accessed 50% of the course, the participants who posted in forums, and the participants who audited the course.
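Relationships of this kind are usually read off a pairwise correlation matrix. The sketch below shows one way to compute it, assuming Pearson correlation and hypothetical column names; the values are invented for illustration only.

```python
import pandas as pd

# Invented values for illustration; column names are assumptions.
courses = pd.DataFrame({
    "Certified": [10, 40, 25, 80, 60],
    "Audited": [100, 300, 150, 500, 700],
    "Posted in Forum": [5, 30, 20, 50, 35],
})

# Pairwise Pearson correlation between all numeric columns.
corr = courses.corr(method="pearson")

# Keep only the correlations with "Certified", strongest first.
certified_corr = corr["Certified"].drop("Certified").sort_values(ascending=False)
print(certified_corr)
```

Pearson correlation only measures linear association, so a strongly skewed variable may be better compared on a log scale or with `method="spearman"`.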

Which institution has the most certified participants?

MITx has more certified participants than HarvardX in every year except 2014.
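A year-by-year comparison like this can be sketched with a pivot table. The column names (`Institution`, `Year`, `Certified`) and the numbers are assumptions for illustration; the placeholder years 1 and 2 do not correspond to real calendar years.

```python
import pandas as pd

# Made-up rows; years 1 and 2 are placeholders, not real calendar years.
courses = pd.DataFrame({
    "Institution": ["HarvardX", "MITx", "HarvardX", "MITx", "MITx"],
    "Year": [1, 1, 2, 2, 2],
    "Certified": [50, 70, 90, 30, 40],
})

# Rows are years, columns are institutions, cells are total certified.
by_year = courses.pivot_table(
    index="Year", columns="Institution", values="Certified", aggfunc="sum"
)

# For each year, the institution with the larger certified total.
leader = by_year.idxmax(axis=1)
print(by_year)
print(leader)
```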

Which subjects are participants interested in over the years for each institution?

Participants in HarvardX courses are most interested in CS and in Religion and Humanities, while MITx courses in Science, Technology, and Government draw the most enrollment.

Animation plot

How many audited participants are there per year?

How many participants were certified in each course subject per year?

Which course subjects do males and females care about the most?

  • Males were more interested in Computer Science than in any other field, and few of them enrolled in Humanities and Education.
  • Females did not seem biased toward any of these fields: the 4 subjects are close to each other, with Humanities the most popular.

Are participants with a bachelor's degree or higher interested in specific subjects?

The violin plot shows that participants with a bachelor's degree or higher are more interested in Government and Humanities than in Engineering and CS. Conversely, we can conclude that participants who are still students are more interested in the CS and Technology fields.

Did the participants complete their courses?

  • There is a huge difference between the number of participants and the number of certified participants: the maximum number of certified participants in a course does not exceed 6000, while the number of enrollments increases exponentially.

  • The boxplots and rug distributions in the margins show that the 2 institutions are similar, except for some outliers in HarvardX.
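The gap between enrollment and certification can be made concrete as a per-course certification rate. The following is a minimal sketch on invented numbers; the `Participants` and `Certified` column names are assumptions.

```python
import pandas as pd

# Invented numbers for illustration; column names are assumptions.
courses = pd.DataFrame({
    "Institution": ["HarvardX", "HarvardX", "MITx", "MITx"],
    "Participants": [10000, 50000, 20000, 40000],
    "Certified": [500, 1000, 1000, 400],
})

# Share of each course's participants who earned a certificate, in percent.
courses["Certified rate (%)"] = 100 * courses["Certified"] / courses["Participants"]

# Summary statistics per institution (the same numbers a boxplot draws).
summary = courses.groupby("Institution")["Certified rate (%)"].describe()
print(summary[["min", "50%", "max"]])
```

Comparing rates rather than raw counts keeps very large courses from dominating the comparison between the two institutions.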

Does the length of the course affect forum discussions and certifications?

The more hours a course has, the more discussion there is in the forums, perhaps because the course gets more complex and participants need to ask questions about the topic. Courses with more hours also have more certified participants than shorter ones.

Thanks for your attention...